Group and sparse group partial least square approaches applied in genomics context

Authors

  • Benoit Liquet
  • Pierre Lafaye de Micheaux
  • Boris P. Hejblum
  • Rodolphe Thiébaut

Abstract

MOTIVATION: The association between two blocks of 'omics' data raises challenging issues in computational biology because of their size and complexity. Here, we focus on a class of multivariate statistical methods called partial least squares (PLS). The sparse version of PLS (sPLS) integrates two datasets while simultaneously selecting the contributing variables. However, these methods do not take into account the important structural or group effects that arise from the relationships between markers within biological pathways. Hence, taking predefined groups of markers (e.g. gene sets) into account could improve the relevance and efficacy of the PLS approach.

RESULTS: We propose two PLS extensions called group PLS (gPLS) and sparse gPLS (sgPLS). Our algorithm makes it possible to study the relationship between two different types of omics data (e.g. SNP and gene expression) or between an omics dataset and multivariate phenotypes (e.g. cytokine secretion). We demonstrate the good performance of gPLS and sgPLS compared with sPLS in the context of grouped data. These methods are then compared on data from an HIV therapeutic vaccine trial. Our approaches provide parsimonious models that reveal the relationship between gene abundance and the immunological response to the vaccine.

AVAILABILITY AND IMPLEMENTATION: The approach is implemented in a comprehensive R package called sgPLS, available on CRAN.

CONTACT: [email protected]

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Similar resources

Improving biological activity prediction of protein kinase inhibitors using artificial neural network and partial least square methods

Introduction: Protein kinases cause many diseases, including cancer; therefore, inhibiting them plays an important role in the treatment of many diseases. Traditional discovery of inhibitors of these enzymes is a time-consuming and costly process. Finding reliable computer-aided drug discovery tools that can detect the inhibitors will reduce the cost. In this study, it is attempted to separate ki...

Partial Least Square and Parallel Factor Analysis Methods Applied for Spectrophotometric Determination of Cefixime in Pharmaceutical Formulations and Biological Fluid

In this study, the direct determination of cefixime, an anti-bacterial agent, in pharmaceutical formulations, urine and human blood plasma was conducted based on spectrophotometric measurements using parallel factor analysis (PARAFAC) and partial least squares (PLS). The calibration set was composed of fourteen solutions in the range of 0.50-9.00 µg mL⁻¹. PLS models were calculated at each p...

Comparing partial least square approaches in a gene- or region-based association study for multiple quantitative phenotypes.

On thinking quantitatively of complex diseases, there are at least three statistical strategies for association studies: one single-nucleotide polymorphism (SNP) on a single trait, gene or region (with multiple SNPs) on a single trait, and gene or region on multiple traits. The third approach is the most general in dissecting genetic mechanisms underlying complex diseases underpinning multiple ...

Journal:
  • Bioinformatics

Volume 32, Issue 1

Pages: -

Publication date: 2016